Results 1 - 16 of 16
1.
Proc Natl Acad Sci U S A ; 120(48): e2308224120, 2023 Nov 28.
Article in English | MEDLINE | ID: mdl-37983496

ABSTRACT

The TnpB proteins are transposon-associated RNA-guided nucleases that are among the most abundant proteins encoded in bacterial and archaeal genomes, but whose functions in the transposon life cycle remain unknown. TnpB appears to be the evolutionary ancestor of Cas12, the effector nuclease of type V CRISPR-Cas systems. We performed a comprehensive census of TnpBs in archaeal and bacterial genomes and constructed a phylogenetic tree on which we mapped various features of these proteins. In multiple branches of the tree, the catalytic site of the TnpB nuclease is rearranged, demonstrating structural and probably biochemical malleability of this enzyme. We identified numerous cases of apparent recruitment of TnpB for other functions of which the most common is the evolution of type V CRISPR-Cas effectors on about 50 independent occasions. In many other cases of more radical exaptation, the catalytic site of the TnpB nuclease is apparently inactivated, suggesting a regulatory function, whereas in others, the activity appears to be retained, indicating that the recruited TnpB functions as a nuclease, for example, as a toxin. These findings demonstrate remarkable evolutionary malleability of the TnpB scaffold and provide extensive opportunities for further exploration of RNA-guided biological systems as well as multiple applications.


Subjects
Bacteria; Ribonucleases; Ribonucleases/metabolism; Phylogeny; Bacteria/metabolism; Archaea/metabolism; Endonucleases/metabolism; CRISPR-Cas Systems; RNA
2.
Nucleic Acids Res ; 51(15): 8150-8168, 2023 08 25.
Article in English | MEDLINE | ID: mdl-37283088

ABSTRACT

CRISPR-cas loci typically contain CRISPR arrays with unique spacers separating direct repeats. Spacers along with portions of adjacent repeats are transcribed and processed into CRISPR (cr) RNAs that target complementary sequences (protospacers) in mobile genetic elements, resulting in cleavage of the target DNA or RNA. Additional, standalone repeats in some CRISPR-cas loci produce distinct cr-like RNAs implicated in regulatory or other functions. We developed a computational pipeline to systematically predict crRNA-like elements by scanning for standalone repeat sequences that are conserved in closely related CRISPR-cas loci. Numerous crRNA-like elements were detected in diverse CRISPR-Cas systems, mostly of type I, but also of subtype V-A. Standalone repeats often form mini-arrays containing two repeat-like sequences separated by a spacer that is partially complementary to promoter regions of cas genes, in particular cas8, or cargo genes located within CRISPR-Cas loci, such as toxins-antitoxins. We show experimentally that a mini-array from a type I-F1 CRISPR-Cas system functions as a regulatory guide. We also identified mini-arrays in bacteriophages that could abrogate CRISPR immunity by inhibiting effector expression. Thus, recruitment of CRISPR effectors for regulatory functions via spacers with partial complementarity to the target is a common feature of diverse CRISPR-Cas systems.


Subjects
CRISPR-Cas Systems; RNA; Repetitive Sequences, Nucleic Acid
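
The scanning step mentioned in the abstract above (looking for standalone repeat-like sequences) can be pictured with a minimal sketch. It assumes a known repeat sequence and simply reports every window of a DNA sequence that differs from it by at most a few mismatches. The function names, the mismatch threshold, and the toy sequences are illustrative assumptions, not the published pipeline, which also checks conservation across related loci.

```python
def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def find_repeat_like(sequence, repeat, max_mismatches=3):
    """Return (start, mismatches) for every window resembling `repeat`.

    Hypothetical helper: a plain sliding-window scan on one strand only.
    """
    sequence = sequence.upper()
    repeat = repeat.upper()
    n = len(repeat)
    hits = []
    for i in range(len(sequence) - n + 1):
        d = hamming(sequence[i:i + n], repeat)
        if d <= max_mismatches:
            hits.append((i, d))
    return hits


# Toy example: one exact copy and one copy with a single substitution.
toy = "AAAA" + "GTTCACTG" + "CCCC" + "GTTCTCTG" + "AAAA"
hits = find_repeat_like(toy, "GTTCACTG", max_mismatches=1)
```

A real scan would also search the reverse strand and use an alignment-based search rather than a fixed-offset comparison; this sketch only conveys the idea of tolerating partial matches.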
3.
bioRxiv ; 2023 Mar 03.
Article in English | MEDLINE | ID: mdl-37090614

ABSTRACT

CRISPR-cas loci typically contain CRISPR arrays with unique spacers separating direct repeats. Spacers along with portions of adjacent repeats are transcribed and processed into CRISPR (cr) RNAs that target complementary sequences (protospacers) in mobile genetic elements, resulting in cleavage of the target DNA or RNA. Additional, standalone repeats in some CRISPR-cas loci produce distinct cr-like RNAs implicated in regulatory or other functions. We developed a computational pipeline to systematically predict crRNA-like elements by scanning for standalone repeat sequences that are conserved in closely related CRISPR-cas loci. Numerous crRNA-like elements were detected in diverse CRISPR-Cas systems, mostly of type I, but also of subtype V-A. Standalone repeats often form mini-arrays containing two repeat-like sequences separated by a spacer that is partially complementary to promoter regions of cas genes, in particular cas8, or cargo genes located within CRISPR-Cas loci, such as toxins-antitoxins. We show experimentally that a mini-array from a type I-F1 CRISPR-Cas system functions as a regulatory guide. We also identified mini-arrays in bacteriophages that could abrogate CRISPR immunity by inhibiting effector expression. Thus, recruitment of CRISPR effectors for regulatory functions via spacers with partial complementarity to the target is a common feature of diverse CRISPR-Cas systems.

4.
CRISPR J ; 4(5): 656-672, 2021 10.
Article in English | MEDLINE | ID: mdl-34582696

ABSTRACT

Type IV CRISPR-Cas are a distinct variety of highly derived CRISPR-Cas systems that appear to have evolved from type III systems through the loss of the target-cleaving nuclease and partial deterioration of the large subunit of the effector complex. All known type IV CRISPR-Cas systems are encoded on plasmids, integrative and conjugative elements (ICEs), or prophages, and are thought to contribute to competition between these elements, although the mechanistic details of their function remain unknown. There is a clear parallel between the compositions and likely origin of type IV and type I systems recruited by Tn7-like transposons and mediating RNA-guided transposition. We investigated the diversity and evolutionary relationships of type IV systems, with a focus on those in Acidithiobacillia, where this variety of CRISPR is particularly abundant and always found on ICEs. Our analysis revealed remarkable evolutionary plasticity of type IV CRISPR-Cas systems, with adaptation and ancillary genes originating from different ancestral CRISPR-Cas varieties, and extensive gene shuffling within the type IV loci. The adaptation module and the CRISPR array apparently were lost in the type IV ancestor but were subsequently recaptured by type IV systems on several independent occasions. We demonstrate a high level of heterogeneity among the repeats in type IV CRISPR arrays, which far exceeds the heterogeneity of any other known CRISPR repeats and suggests a unique adaptation mechanism. The spacers in the type IV arrays, for which protospacers could be identified, match plasmid genes, in particular those encoding the conjugation apparatus components. Both the biochemical mechanism of type IV CRISPR-Cas function and their role in the competition among mobile genetic elements remain to be investigated.


Subjects
CRISPR-Cas Systems/genetics; Evolution, Molecular; Proteobacteria/genetics; Genes, Bacterial; Phylogeny; Polymorphism, Genetic; Proteobacteria/classification
5.
Science ; 372(6541), 2021 04 30.
Article in English | MEDLINE | ID: mdl-33926924

ABSTRACT

CRISPR-Cas systems provide RNA-guided adaptive immunity in prokaryotes. We report that the multisubunit CRISPR effector Cascade transcriptionally regulates a toxin-antitoxin RNA pair, CreTA. CreT (Cascade-repressed toxin) is a bacteriostatic RNA that sequesters the rare arginine tRNA^UCU (transfer RNA with anticodon UCU). CreA is a CRISPR RNA-resembling antitoxin RNA, which requires Cas6 for maturation. The partial complementarity between CreA and the creT promoter directs Cascade to repress toxin transcription. Thus, CreA becomes antitoxic only in the presence of Cascade. In CreTA-deleted cells, cascade genes become susceptible to disruption by transposable elements. We uncover several CreTA analogs associated with diverse archaeal and bacterial CRISPR-cas loci. Thus, toxin-antitoxin RNA pairs can safeguard CRISPR immunity by making cells addicted to CRISPR-Cas, which highlights the multifunctionality of Cas proteins and the intricate mechanisms of CRISPR-Cas regulation.


Subjects
CRISPR-Associated Proteins/physiology; CRISPR-Cas Systems/physiology; Haloarcula/physiology; RNA, Archaeal/physiology; Toxin-Antitoxin Systems/physiology; CRISPR-Associated Proteins/genetics; CRISPR-Cas Systems/genetics; DNA Mutational Analysis; Gene Expression Regulation, Archaeal; Haloarcula/genetics; Operon; RNA, Transfer, Arg/metabolism; Toxin-Antitoxin Systems/genetics
6.
Nucleic Acids Res ; 49(4): e20, 2021 02 26.
Article in English | MEDLINE | ID: mdl-33290505

ABSTRACT

CRISPR-Cas are adaptive immune systems that degrade foreign genetic elements in archaea and bacteria. In carrying out their immune functions, CRISPR-Cas systems heavily rely on RNA components. These CRISPR (cr) RNAs are repeat-spacer units that are produced by processing of pre-crRNA, the transcript of CRISPR arrays, and guide Cas protein(s) to the cognate invading nucleic acids, enabling their destruction. Several bioinformatics tools have been developed to detect CRISPR arrays based solely on DNA sequences, but all these tools employ the same strategy of looking for repetitive patterns, which might correspond to CRISPR array repeats. The identified patterns are evaluated using a fixed, built-in scoring function, and arrays exceeding a cut-off value are reported. Here, we instead introduce a data-driven approach that uses machine learning to detect and differentiate true CRISPR arrays from false ones based on several features. Our CRISPR detection tool, CRISPRidentify, performs three steps: detection, feature extraction and classification based on manually curated sets of positive and negative examples of CRISPR arrays. The identified CRISPR arrays are then reported to the user accompanied by detailed annotation. We demonstrate that our approach identifies not only previously detected CRISPR arrays, but also CRISPR array candidates not detected by other tools. Compared to other methods, our tool has a drastically reduced false positive rate. In contrast to the existing tools, our approach not only provides the user with the basic statistics on the identified CRISPR arrays but also produces a certainty score as a practical measure of the likelihood that a given genomic region is a CRISPR array.


Subjects
Clustered Regularly Interspaced Short Palindromic Repeats; Machine Learning; Software; Genome, Archaeal; Genome, Bacterial
7.
CRISPR J ; 3(6): 535-549, 2020 12.
Article in English | MEDLINE | ID: mdl-33346707

ABSTRACT

CRISPR-Cas systems typically consist of a CRISPR array and cas genes that are organized in one or more operons. However, a substantial fraction of CRISPR arrays are not adjacent to cas genes. Definitive identification of such isolated CRISPR arrays runs into the problem of false-positives, with unrelated types of repetitive sequences mimicking CRISPR. We developed a computational pipeline to eliminate false CRISPR predictions and found that up to 25% of the CRISPR arrays in complete bacterial and archaeal genomes are located away from cas genes. Most of the repeats in these isolated arrays are identical to repeats in cas-adjacent CRISPR arrays in the same or closely related genomes, indicating an evolutionary relationship between isolated arrays and arrays in typical CRISPR-cas loci. The spacers in isolated CRISPR arrays show nearly as many matches to viral genomes as spacers from complete CRISPR-cas loci, suggesting that the isolated arrays were either functionally active recently or continue to function. Reconstruction of evolutionary events in closely related bacterial genomes suggests three routes of evolution of isolated CRISPR arrays: (1) loss of cas genes in a CRISPR-cas locus, (2) de novo generation of arrays from off-target spacer integration into sequences resembling the corresponding repeats, and (3) transfer by mobile genetic elements. Both combination of de novo emerging arrays with cas genes and regain of cas genes by isolated arrays via recombination likely contribute to functional diversification in CRISPR-Cas evolution.


Subjects
Clustered Regularly Interspaced Short Palindromic Repeats/genetics; Computational Biology/methods; Gene Editing/methods; Bacteria/genetics; CRISPR-Associated Proteins/genetics; CRISPR-Cas Systems/genetics; Clustered Regularly Interspaced Short Palindromic Repeats/physiology; Genome, Archaeal/genetics; Genome, Bacterial/genetics; Genome, Viral/genetics; Genomics/methods; Phylogeny
8.
Nat Commun ; 11(1): 3784, 2020 07 29.
Article in English | MEDLINE | ID: mdl-32728052

ABSTRACT

CRISPR-Cas are adaptive bacterial and archaeal immunity systems that have been harnessed for the development of powerful genome editing and engineering tools. In the incessant host-parasite arms race, viruses evolved multiple anti-defense mechanisms, including diverse anti-CRISPR proteins (Acrs) that specifically inhibit CRISPR-Cas and therefore have enormous potential for application as modulators of genome editing tools. Most Acrs are small and highly variable proteins, which makes their bioinformatic prediction a formidable task. We present a machine-learning approach for comprehensive Acr prediction. The model shows high predictive power when tested against an unseen test set and was employed to predict 2,500 candidate Acr families. Experimental validation of top candidates revealed two previously unknown Acrs (AcrIC9 and AcrIC10), and three other top candidates were coincidentally identified and found to possess anti-CRISPR activity. These results substantially expand the repertoire of predicted Acrs and provide a resource for experimental Acr discovery.


Subjects
Bacteriophages/genetics; CRISPR-Associated Protein 9/antagonists & inhibitors; Machine Learning; Sequence Analysis, Protein/methods; Viral Proteins/genetics; Archaea/genetics; Archaea/virology; Bacteria/genetics; Bacteria/virology; CRISPR-Associated Protein 9/genetics; CRISPR-Cas Systems/genetics; Computational Biology/methods; Datasets as Topic; Gene Editing/methods; Host-Parasite Interactions/genetics; Sequence Homology, Amino Acid
9.
Commun Biol ; 3(1): 321, 2020 06 22.
Article in English | MEDLINE | ID: mdl-32572116

ABSTRACT

CRISPR arrays contain spacers, some of which are homologous to genome segments of viruses and other parasitic genetic elements and are employed as portions of guide RNAs to recognize and specifically inactivate the target genomes. However, the fraction of the spacers in sequenced CRISPR arrays that reliably match protospacer sequences in genomic databases is small, leaving the question of the origin(s) open for the great majority of the spacers. Here, we extend the spacer analysis by examining the distribution of partial matches (matching k-mers) between spacers and genomes of viruses infecting the given host as well as the host genomes themselves. The results indicate that most of the spacers originate from the host-specific viromes, whereas self-targeting is strongly selected against. However, we present evidence that the vast majority of the viruses comprising the viromes currently remain unknown although they are likely to be related to identified viruses.


Subjects
Clustered Regularly Interspaced Short Palindromic Repeats; Prokaryotic Cells/virology; Virome/genetics; Adaptation, Biological/genetics; Bacteria/genetics; Bacteria/virology; Escherichia coli/genetics; Escherichia coli/virology; Genome; Host-Pathogen Interactions/genetics; Proviruses/genetics
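
To make the notion of partial matches (matching k-mers) in the abstract above concrete, here is a minimal sketch that computes the fraction of a spacer's distinct k-mers that also occur on either strand of a genome. The function names, the default k = 8, and the toy sequences are assumptions chosen for illustration; the statistics used in the paper itself are not reproduced here.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def kmers(seq, k):
    """Set of all length-k substrings of seq (upper-cased)."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def reverse_complement(seq):
    """Reverse complement of a DNA string containing only A, C, G, T."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def shared_kmer_fraction(spacer, genome, k=8):
    """Fraction of the spacer's distinct k-mers present on either genome strand."""
    spacer_kmers = kmers(spacer, k)
    if not spacer_kmers:
        return 0.0  # spacer shorter than k
    genome_kmers = kmers(genome, k) | kmers(reverse_complement(genome), k)
    return len(spacer_kmers & genome_kmers) / len(spacer_kmers)


# Toy data: the genome contains the spacer; an unrelated sequence does not.
spacer = "ACGTACGTTTGA"
genome = "GGG" + spacer + "CCC"
full_match = shared_kmer_fraction(spacer, genome)
no_match = shared_kmer_fraction(spacer, "A" * 16)
```

In practice, the observed distribution of such fractions would have to be compared with a background expectation (for example, from shuffled sequences) before drawing conclusions; that step is omitted from this sketch.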
10.
CRISPR J ; 3(3): 156-163, 2020 06.
Article in English | MEDLINE | ID: mdl-33555973

ABSTRACT

The principal function of archaeal and bacterial CRISPR-Cas systems is antivirus adaptive immunity. However, recent genome analyses identified a variety of derived CRISPR-Cas variants at least some of which appear to perform different functions. Here, we describe a unique repertoire of CRISPR-Cas-related systems that we discovered by searching archaeal metagenome-assembled genomes of the Asgard superphylum. Several of these variants contain extremely diverged homologs of Cas1, the integrase involved in CRISPR adaptation as well as casposon transposition. Strikingly, the diversity of Cas1 in Asgard archaea alone is greater than that detected so far among the rest of archaea and bacteria. The Asgard CRISPR-Cas derivatives also encode distinct forms of Cas4, Cas5, and Cas7 proteins, and/or additional nucleases. Some of these systems are predicted to perform defense functions, but possibly not programmable ones, whereas others are likely to represent previously unknown mobile genetic elements.


Subjects
Archaea/genetics; CRISPR-Cas Systems; Clustered Regularly Interspaced Short Palindromic Repeats; Archaea/classification; Archaea/metabolism; Archaeal Proteins/genetics; Archaeal Proteins/metabolism; Endonucleases/genetics; Genome, Archaeal; Metagenome; Phylogeny
11.
Nat Rev Microbiol ; 18(2): 67-83, 2020 02.
Article in English | MEDLINE | ID: mdl-31857715

ABSTRACT

The number and diversity of known CRISPR-Cas systems have substantially increased in recent years. Here, we provide an updated evolutionary classification of CRISPR-Cas systems and cas genes, with an emphasis on the major developments that have occurred since the publication of the latest classification, in 2015. The new classification includes 2 classes, 6 types and 33 subtypes, compared with 5 types and 16 subtypes in 2015. A key development is the ongoing discovery of multiple, novel class 2 CRISPR-Cas systems, which now include 3 types and 17 subtypes. A second major novelty is the discovery of numerous derived CRISPR-Cas variants, often associated with mobile genetic elements that lack the nucleases required for interference. Some of these variants are involved in RNA-guided transposition, whereas others are predicted to perform functions distinct from adaptive immunity that remain to be characterized experimentally. The third highlight is the discovery of numerous families of ancillary CRISPR-linked genes, often implicated in signal transduction. Together, these findings substantially clarify the functional diversity and evolutionary history of CRISPR-Cas.


Subjects
Archaea/genetics; Bacteria/genetics; CRISPR-Cas Systems/genetics; Evolution, Molecular; Gene Expression Regulation, Archaeal/physiology; Gene Expression Regulation, Bacterial/physiology; CRISPR-Cas Systems/physiology
12.
Nat Protoc ; 14(10): 3013-3031, 2019 10.
Article in English | MEDLINE | ID: mdl-31520072

ABSTRACT

Functionally linked genes in bacterial and archaeal genomes are often organized into operons. However, the composition and architecture of operons are highly variable and frequently differ even among closely related genomes. Therefore, to efficiently extract reliable functional predictions for uncharacterized genes from comparative analyses of the rapidly growing genomic databases, dedicated computational approaches are required. We developed a protocol to systematically and automatically identify genes that are likely to be functionally associated with a 'bait' gene or locus by using relevance metrics. Given a set of bait loci and a genomic database defined by the user, this protocol compares the genomic neighborhoods of the baits to identify genes that are likely to be functionally linked to the baits by calculating the abundance of a given gene within and outside the bait neighborhoods and the distance to the bait. We exemplify the performance of the protocol with three test cases, namely, genes linked to CRISPR-Cas systems using the 'CRISPRicity' metric, genes associated with archaeal proviruses and genes linked to Argonaute genes in halobacteria. The protocol can be run by users with basic computational skills. The computational cost depends on the sizes of the genomic dataset and the list of reference loci and can vary from one CPU-hour to hundreds of hours on a supercomputer.


Subjects
Computational Biology/methods; Genes, Archaeal; Genes, Bacterial; Genomics/methods; CRISPR-Cas Systems; Genome, Archaeal; Genome, Bacterial; Molecular Sequence Annotation/methods; Open Reading Frames; Operon
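
The idea of comparing a gene's abundance within and outside bait neighborhoods, described in the abstract above, can be sketched as a simple ratio. The code below is a deliberately simplified assumption and not the published relevance metric: it treats all positions as coordinates on a single contig, uses a hypothetical fixed window around each bait, and ignores the distance weighting that the protocol mentions.

```python
from collections import Counter


def neighborhood_association(gene_hits, bait_positions, window=10_000):
    """Fraction of each gene family's occurrences that lie near a bait.

    gene_hits: iterable of (family, position) pairs (hypothetical input format).
    bait_positions: positions of the bait genes or loci on the same contig.
    window: maximum distance to a bait for a hit to count as "in the neighborhood".
    """
    total = Counter()
    near = Counter()
    for family, pos in gene_hits:
        total[family] += 1
        if any(abs(pos - bait) <= window for bait in bait_positions):
            near[family] += 1
    return {family: near[family] / total[family] for family in total}


# Toy data: famA is mostly found next to the bait, famB never is.
hits = [("famA", 1_000), ("famA", 2_000), ("famA", 500_000), ("famB", 900_000)]
scores = neighborhood_association(hits, bait_positions=[1_500])
```

Families whose score approaches that of known bait-associated genes would be candidates for functional linkage; any real threshold would have to be calibrated on the data set at hand.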
13.
Nat Rev Microbiol ; 17(8): 513-525, 2019 08.
Article in English | MEDLINE | ID: mdl-31165781

ABSTRACT

The principal function of CRISPR-Cas systems in archaea and bacteria is defence against mobile genetic elements (MGEs), including viruses, plasmids and transposons. However, the relationships between CRISPR-Cas and MGEs are far more complex. Several classes of MGE contributed to the origin and evolution of CRISPR-Cas, and, conversely, CRISPR-Cas systems and their components were recruited by various MGEs for functions that remain largely uncharacterized. In this Analysis article, we investigate and substantially expand the range of CRISPR-Cas components carried by MGEs. Three groups of Tn7-like transposable elements encode 'minimal' type I CRISPR-Cas derivatives capable of target recognition but not cleavage, and another group encodes an inactivated type V variant. These partially inactivated CRISPR-Cas variants might mediate guide RNA-dependent integration of the respective transposons. Numerous plasmids and some prophages encode type IV systems, with similar predicted properties, that appear to contribute to competition among plasmids and between plasmids and viruses. Many prokaryotic viruses also carry CRISPR mini-arrays, some of which recognize other viruses and are implicated in inter-virus conflicts, and solitary repeat units, which could inhibit host CRISPR-Cas systems.


Subjects
CRISPR-Cas Systems; Evolution, Molecular; Gene Transfer, Horizontal; Interspersed Repetitive Sequences; Recombination, Genetic; Archaea/genetics; Bacteria/genetics; Bacteriophages/genetics; DNA Transposable Elements; Plasmids
14.
RNA Biol ; 16(4): 435-448, 2019 04.
Article in English | MEDLINE | ID: mdl-30103650

ABSTRACT

Trans-activating CRISPR (tracr) RNA is a distinct RNA species that interacts with the CRISPR (cr) RNA to form the dual guide (g) RNA in type II and subtype V-B CRISPR-Cas systems. The tracrRNA-crRNA interaction is essential for pre-crRNA processing as well as target recognition and cleavage. The tracrRNA consists of an antirepeat, which forms an imperfect hybrid with the repeat in the crRNA, and a distal region containing a Rho-independent terminator. Exhaustive comparative analysis of the sequences and predicted structures of the Class 2 CRISPR guide RNAs shows that all these guide RNAs share distinct structural features, in particular, the nexus stem-loop that separates the repeat-antirepeat hybrid from the distal portion of the tracrRNA and the conserved GU pair at that end of the hybrid. These structural constraints might ensure full exposure of the spacer for target recognition. Reconstruction of tracrRNA evolution for 4 tight bacterial groups demonstrates random drift of repeat-antirepeat complementarity within a window of hybrid stability that is, apparently, maintained by selection. An evolutionary scenario is proposed whereby tracrRNAs evolved on multiple occasions, via rearrangement of a CRISPR array to form the antirepeat in different locations with respect to the array. A functional tracrRNA would form if, in the new location, the antirepeat is flanked by sequences that meet the minimal requirements for a promoter and a Rho-independent terminator. Alternatively, or additionally, the antirepeat sequence could be occasionally 'reset' by recombination with a repeat, restoring the functionality of tracrRNAs that drift beyond the required minimal hybrid stability.


Subjects
CRISPR-Cas Systems/genetics; Evolution, Molecular; Genomics; RNA, Bacterial/genetics; Trans-Activators/genetics; Bacteroides/genetics; Base Sequence; Conserved Sequence/genetics; Nucleic Acid Conformation; RNA, Guide, Kinetoplastida/genetics; Repetitive Sequences, Nucleic Acid/genetics; Streptococcus/genetics; Thermodynamics
15.
Proc Natl Acad Sci U S A ; 115(23): E5307-E5316, 2018 06 05.
Article in English | MEDLINE | ID: mdl-29784811

ABSTRACT

The CRISPR-Cas systems of bacterial and archaeal adaptive immunity consist of direct repeat arrays separated by unique spacers and multiple CRISPR-associated (cas) genes encoding proteins that mediate all stages of the CRISPR response. In addition to the relatively small set of core cas genes that are typically present in all CRISPR-Cas systems of a given (sub)type and are essential for the defense function, numerous genes occur in CRISPR-cas loci only sporadically. Some of these have been shown to perform various ancillary roles in CRISPR response, but the functional relevance of most remains unknown. We developed a computational strategy for systematically detecting genes that are likely to be functionally linked to CRISPR-Cas. The approach is based on a "CRISPRicity" metric that measures the strength of CRISPR association for all protein-coding genes from sequenced bacterial and archaeal genomes. Uncharacterized genes with CRISPRicity values comparable to those of cas genes are considered candidate CRISPR-linked genes. We describe additional criteria to predict functional relevance for genes in the candidate set and identify 79 genes as strong candidates for functional association with CRISPR-Cas systems. A substantial majority of these CRISPR-linked genes reside in type III CRISPR-cas loci, which implies exceptional functional versatility of type III systems. Numerous candidate CRISPR-linked genes encode integral membrane proteins suggestive of tight membrane association of CRISPR-Cas systems, whereas many others encode proteins implicated in various signal transduction pathways. These predictions provide ample material for improving annotation of CRISPR-cas loci and experimental characterization of previously unsuspected aspects of CRISPR-Cas system functionality.


Subjects
CRISPR-Cas Systems/genetics; Clustered Regularly Interspaced Short Palindromic Repeats/genetics; Archaea/genetics; Bacteria/genetics; Base Sequence; CRISPR-Associated Proteins/genetics; Computer Simulation; Evolution, Molecular; Genes, Bacterial; Genetic Testing; Genome, Archaeal; Genome, Bacterial; Phylogeny
16.
mBio ; 8(5), 2017 09 19.
Article in English | MEDLINE | ID: mdl-28928211

ABSTRACT

Clustered regularly interspaced short palindromic repeats and CRISPR-associated protein (CRISPR-Cas) systems store the memory of past encounters with foreign DNA in unique spacers that are inserted between direct repeats in CRISPR arrays. For only a small fraction of the spacers, homologous sequences, called protospacers, are detectable in viral, plasmid, and microbial genomes. The rest of the spacers remain the CRISPR "dark matter." We performed a comprehensive analysis of the spacers from all CRISPR-cas loci identified in bacterial and archaeal genomes, and we found that, depending on the CRISPR-Cas subtype and the prokaryotic phylum, protospacers were detectable for 1% to about 19% of the spacers (~7% global average). Among the detected protospacers, the majority, typically 80 to 90%, originated from viral genomes, including proviruses, and among the rest, the most common source was genes that are integrated into microbial chromosomes but are involved in plasmid conjugation or replication. Thus, almost all spacers with identifiable protospacers target mobile genetic elements (MGE). The GC content, as well as dinucleotide and tetranucleotide compositions, of microbial genomes, their spacer complements, and the cognate viral genomes showed a nearly perfect correlation and were almost identical. Given the near absence of self-targeting spacers, these findings are most compatible with the possibility that the spacers, including the dark matter, are derived almost completely from the species-specific microbial mobilomes.

IMPORTANCE The principal function of CRISPR-Cas systems is thought to be protection of bacteria and archaea against viruses and other parasitic genetic elements. The CRISPR defense function is mediated by sequences from parasitic elements, known as spacers, that are inserted into CRISPR arrays and then transcribed and employed as guides to identify and inactivate the cognate parasitic genomes. However, only a small fraction of the CRISPR spacers match any sequences in the current databases, and of these, only a minority correspond to known parasitic elements. We show that nearly all spacers with matches originate from viral or plasmid genomes that are either free or have been integrated into the host genome. We further demonstrate that spacers with no matches have the same properties as those of identifiable origins, strongly suggesting that all spacers originate from mobile elements.


Subjects
CRISPR-Associated Proteins/genetics; Clustered Regularly Interspaced Short Palindromic Repeats; Genome, Archaeal; Genome, Bacterial; Plasmids; Archaea/genetics; Bacteria/genetics; CRISPR-Cas Systems; Genome, Viral; Oligonucleotides/chemistry; Oligonucleotides/genetics; Sequence Analysis, DNA
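
The compositional comparison mentioned in the abstract above (GC content and dinucleotide composition of genomes, spacers, and viral genomes) relies on quantities that are straightforward to compute. The following minimal sketch shows one way to obtain them for a single sequence; the function names and toy inputs are illustrative assumptions, and the correlation analysis across genomes is not shown.

```python
from collections import Counter

_BASES = set("ACGT")


def gc_content(seq):
    """Fraction of G and C among the unambiguous bases (A, C, G, T) of seq."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        return 0.0
    return (seq.count("G") + seq.count("C")) / acgt


def dinucleotide_frequencies(seq):
    """Relative frequencies of overlapping dinucleotides, skipping ambiguous bases."""
    seq = seq.upper()
    counts = Counter(
        seq[i:i + 2]
        for i in range(len(seq) - 1)
        if set(seq[i:i + 2]) <= _BASES
    )
    total = sum(counts.values())
    return {dinuc: n / total for dinuc, n in counts.items()} if total else {}


# Toy inputs only; real analyses would use whole genomes and pooled spacers.
gc = gc_content("GGCCAATT")
freqs = dinucleotide_frequencies("ACAC")
```

Comparing such profiles between a host genome, its spacers, and candidate viral genomes (for instance with a correlation coefficient) is the kind of analysis the abstract refers to.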